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Abstract A type of iterative orthogonally accumulated projection methods for 
solving linear system of equations are proposed in this paper. This type of meth¬ 
ods are applications of accumulated projection(AP) technique proposed recently 
by authors. Instead of searching projections in a sequence of subspaces as done in 
the original AP approach, these methods try to efficiently construct a sequence 
of orthonormal vectors while the inner-product between the solution to the sys¬ 
tem and each vector in the sequence can be easily calculated, thus the solution 
can be retrieved in hnite number of iterations in case of exact arithmetic oper¬ 
ations. We also discuss the strategies to handle loss-of-orthogonality during the 
process of constructing orthonormal vectors. Numerical experiments are provided 
to demonstrate the efficiency of these methods. 
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1 Introduction 

The study of iterative methods for solving least square problems in the form 


Ax = b (1) 

where A G especially for large scale computing is of vital importance. Here 

we always assume A is nonsingular so that there exists a unique solution to the 
system. There are a lot of iterative methods available [Tll^l^ISlI^ITO] for solving 
system 0. 
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Recently all current iterative methods are classified as extended Krylov sub¬ 
space methods in [14] , which are characterized by their major operations: matrix- 
vector multiplications with usually one or two fixed matrices and one or two 
fixed initial vectors. These includes the most well-known stationary methods such 
as Jacobi, Gauss Seidal as well as SOR methods with their iterative matrices 
formed on the base of splitting the coefficient matrices [8], and the row projection 
methods such as Karcmarz’s method and Cimmino’s methods where the itera¬ 
tive matrix (not explicitly formed in iterations) are constructed by the successive 
multiplications of a sequence of projection matrices with a hxed sequence length 
m(depending on the splitting of the coefficient matrix into m submatrices [4] [7]). 
The non-stationary iterative methods include the well-known Krylov subspace 
methods such as conjugate gradient method(CG) for symmetric positive definite 
systems, MINRES, SYMMLQ for general symmetric but indefinite systems, and 
GMRES, BiGG, BiGR, QMR, LSQR, etc. for general nonsymmetric systems[6l[8l 
[TT |[I6|[T7] : many of these methods(including GMRES, MINRES, SYMMLQ, MIN¬ 
RES, QMR, LSQR) use the strategy of reducing some related residual norms to 
search for approximate solutions, while variants of GG and BiCG methods use the 
strategy of producing a sequence of orthogonal residuals, thus they can reach the 
exact solutions with n iterations in exact arithmetic operations, where n is the 
number of unknowns [8]. 

In |14] authors also presents a first non-krylov subspace type methods-The 
Accumulated Projection Methods. These type of methods rely on successive pro¬ 
jections over subspaces of which produce a sequence of projection vectors with 
a monotonically increasing Euclidean norms. Unlike the well-known row-projection 
technique which can be shown as a traditional stationary iterative methods [7], 
the AP methods proposed in[14] do not involve matrix-vector multiplications with 
any fixed matrices and fixed vectors. Equipped with some accelerating technique, 
the AP methods exhibit some superior behavior than traditional extended Krylov 
subspace methods [14] in some cases. 

The success of AP methods rely on the calculation of projection vector of 
exact solution x G over a sequence of subspaces of {k = 1,2,-•• ,m), 
where is formed by the row vectors of coefficient matrix and the most recent 
approximations pk of x. The calculation of these projection vectors are based on the 
QR factorization of matrix for general matrices, or QS[13] decomposition of 
if the coefficient matrix is sparse. Generally speaking, the QR factorization needs 
0{m^n) flops and is thus a heavy burden if a long iteration is needed, current LGO 
decomposition requires that the coefficient matrix satisfies some special property 
(for example, /c-orthogonality) and its implementation is quite complicate. One of 
our purpose in this paper is to provide a more efficient way to handle the projection 
of any given vector into a subspace of B^ with much less float point operations. 

Our major task in this paper is to provide a class of methods based on the 
principle of accumulated projection to handle linear system of equations. For the 
sake of completeness, we are to briefly review the principle of accumulated pro¬ 
jection technique and its applications in the next section. The other sections are 
devoted to discuss the exploration of AP technique in a more intricate way which 
leads to a series of algorithms for solving linear systems. 
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2 Principle of AP technique 

Now we review the basic idea of accumulated projection methods. To approximate 
any vector x in , one has to construct a subspace W of R!^ with a much smaller 
rank than n so that a “projection” vector p of x is easily available. Current preva¬ 
lent methods depend on the strategy of reducing the length of residual vectors 
to obtain such a projection. While only a few methods use the regular orthog¬ 
onal projection to get approximate vectors, which include the so-called General 
Error Minimizing Method (which is similar to GMRES method) [5] and the Line 
Projection method proposed in [12], both can be classified as extended Krylov 
subspace methods since both of them depend on certain Krylov subspace from 
which a projection vector is sought. To be able to figure out the projections of 
X over subspace W, one has to get some “footprint” of x over IT, for example in 
GMRES-like methods a basis vectors of W in the form of A^h with b as image of x 
under the transformation A are required, while in GMERR and LP methods, the 
inner-products between x and a basis of W are available. By this observation 
we can derive another class of methods for solving linear system of equations using 
orthogonal projections. 

The basic idea of AP is to use the orthogonal projections of vector x as its 
approximations, while each projection is used to form another subspace from which 
a better approximation is sought. The following graph can be used to illustrate 
the whole idea, where Xi stands for the approximations to x and at are projection 



Fig. 1 Accumulated Projection 


vectors of x on some subspaces of R^. is the projection of vector x in a 

subspace Wi formed by Xi and a subspace Wi where projection vector of vector 
X is easily available. 

The following algorithm describes a simple implementation of the accumulated 
projection idea, where vector ai is orthogonal to vector x^. 
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Algorithm 1 (accumulated projection process-AP) The following procedure produces 
an approximate vector p to the solution vector x which satisfies Ax = b. 

(1) Divide matrix A into k blocks: A = \A\ .A'n, ■ ■ ■ .A'A' divide b correspond¬ 
ingly: b = {b[,b 2 , ■ ■ ■ ,h'^y. 

(2) Initialize po as po = aA'h and cq = aWbW^, where a = \\b\\^/\\A'b\\^. 

(3) For i = 1 to k 

(3.1) Construct matrix Wi = and vector I = [ci-i,b[]'. 

(3.2) Compute the projection vector pi of x onto subspace ran{Wi) 
and the scalar Ci{= x'pi). 

(4) Output p{= pk) andc{=Ck). 

This algorithm formed the basis of some more efficient solvers for linear system 
of equations such as SAP and MSAP and APAP methods introduced in [15] and 
[14] . It is observed that these methods seem to be more efficient than regular 
Krylov subspace methods in case of large scale systems in some situation. It is 
necessary to mention that these methods do not construct any Krylov subspace 
methods and thus can not be classified as extended Krylov subspace methods. 
In this paper we will show that the AP process can also be used to construct a 
class of Krylov subspace methods, named as orthogonally accumulated projection 
solver (OAP). 


3 An orthogonally accumulated projection through tridiagonalization 

In this section we will consider to solve system 0 with a unsymmetric coefficient 
matrix A. The main idea is to transform the original system 0 into a system 


Qx = c (2) 

where Q is an orthogonal matrix, i.e., Q'Q = I where I is the identity matrix. In 
other words, we will search for a sequence of orthonormal vectors Vi (i = 1,2, ■ ■ ■ ,n) 
and real numbers (i = I, 2, ■ • ■ , n) so that x'vi = q, and thus x can be taken as 
In the meantime we do not have to spend too much extra storage space 
to store all vectors Vi{i = 1,2,-■■ ,n), instead we will show that a short length 
recurrence relationship occurs between contagious orthogonal vectors so that only 
a few extra storage space for these vectors is needed. 

In order to figure out how this will work, let us review the principle of AP as 
illustrated in Figure Q. In general the sequence of projection vectors come from 
some predetermined subspaces and thus they are not necessary to be orthogonal. 
However it is possible for us to work out a way so that all of these projection 
vectors (z = 1,2, ,) form an orthogonal sequence. To be complete, we first 

recall the Laczos iterations for tridiagonalization of a rectangular matrix. 


3.1 Matrix tridiagonalization by orthogonal transformation 

Any matrix A G transformed into the following tridiagonal form 


U'AV = T 


( 3 ) 
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where T is tridiagonal 


/ ai 

Pi 0 ■■ 

• 0 

0 \ 

71 

a2|32■■ 

• 0 

0 

0 

0 0 •• 

■ Oin — 1 Pn—l 

V 0 

0 0 •• 

• 7n-l Ctn / 


both U and V are orthogonal, i.e., U'U = V'V = Im- This transform can be 
accomplished in a rather stable way by applying Householder transformations on 
both sides of A. However when A is sparse and large, we can expect dense and 
large submatrices to appear in this process, which makes it not suitable in large 
scale computations. 

Fortunately a Lanczos-like process can be used to do the tridiagonalization in 
a much cheaper and efficient way. To illustrate this we rewrite equation into 
the following forms 


AV = UT 

( 4 ) 

and 


A'U = VT' 

( 5 ) 

Equating k-th column of both sides of Q and ([^ we have 


^l^k — 7fcllfc+l T CK/cllfc T Pk—l'^k—1 

(6) 

and 


A Uk = PkVk+i + OLkVk + ^k-iVk-i 

( 7 ) 

with /3o = 70 = 0 , where Vk and Uk denote the fc-th columns of 
U separately, vq and no = 0 are zero vectors, i.e, V = [v\,V2,-'' 
[u\,U 2 , - ■ ■ , Un\. Especially we have 

matrix V and 
,Vn] and U = 

Av\ = aini + 71^2 and A! u\ = + P 1 V 2 

( 8 ) 


which suggests that if both ui and vi are given, then V 2 and U 2 can be calculated 
simultaneously. The rest vectors and for k > 3 can be calculated by rewriting 
and 0 as follows 


{-^^k ^k^k Pk—l'^k—l') 
Ik 


and 


'^fc+l p '^k ^k'^k 'yk—l'^k—l') 


The following algorithm depicts the above process. 


( 9 ) 


( 10 ) 


nXn 


Algorithm 2 Let A ^ R 


vi and ui be unit vectors. 
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/3o = 70 = 0; VO = uo = 0(g i?”) 
for k = 1 to n — 1 

Wk = Avk - akUk - Pk-iUk-i 

Ik = llw^fcll 

Uk + l = Wk/jk 

Qk ^ ^k'^k lk — l'^k—1 

/3fc = Ikfcll 

^fc+l Qk/h 
end 

In exact arithmetic operations the above Lanczos-like iteration will produce 
two orthonormal vector squences v\,V 2 , -'' Vn and ui,U 2 , -' 'Un with any starting 
unit vectors ui and vi, assuming no break-down happens(i.e., 7 /c 7 ^ 0 and Pk ^ 0 
for all k). Note that in each loop in the iteration one needs only two matrix-vector 
multiplications as its major flop counts, this makes it very effective when dealing 
with tridiagonalizations of large and sparse matrices. 


3.2 orthogonally accumulated projection 


We have observed that in basic AP algorithm to make sure next approximation 
Xk+i is a better approximation to x(the exact solution) than Xk, a projection on a 
subspace which contains Xk must be done, which guarantees that ||efc_|_i|| < ||efc|| 
where Ck = x — Xk is the error vector associated with Xk- However if at can be 
constructed in such a way that they always satisfy 

Xk -L ai for i > k, ( 11 ) 


there is no need to do the extra projection to get the next approximation Xk+i, 
instead one can simply obtain Xk+i by Xk+i = Xk+ai. Obviously if vector sequence 
{ak}i forms an orthonormal sequence of vectors in R^, and let Xk = ciai 

where q = x'ai, then it is easy to see that ( |TT| holds true. This is exactly the 
principle of orthogonally accumulated projection(OAP). In other words, to solve 
system Q, OAP method builds a sequence of orthonormal vectors {r’iji as well 
as sequence of {ci}i, the inner-product between x and each of Vi, i.e., Ci = x'vi, 
thus X can be retrieved as x = 

We will shown in next section that in exact arithmetic operations. Algorithm 
will produce a sequence of orthonormal vectors {vi}f^i; in order to find the 
inner-product Ci between x and each Vi, we multiply by x both sides of equation 
( [To] ), this leads to 

Cfc + l ^ CX-kX Vk 7A: — 1 ) 

( 12 ) 

/377fc—iCfc— 1 ) 


since Ax = b, particularly we have C 2 = — aici). This implies that if ci 

is known, then all the other subsequent Ci{i = 2,3, •••«) can be calculated by 
(12). These process can be described in the following algorithm, which is called 
orthogonally accumulated projection for solving linear system of equations. 
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Algorithm 3 (orthogonally accumulated projection method-OAP) Let A in be 

an unsymmetric and nonsingular matrix and b G a non-zero vector. Let vi and ui 
be two unit vectors and ci = x'vi he given, where x is the solution to The following 
process gives the exact solution x to system Ax = h. 

xi = civi,l3o = 70 = 0, Vo = tto = 0 (g i?"") 

for k = 1 to n — 1 
Oik = u^Avk 

Pk ■^'^k Olj^Up,, — i 

Ik = \\Pk\\ 

Pk/^k 

Qk ^ ^k OLf^V].„ 7A:——1 

Pk — I lO'fc 11 

'^k-\-l Qk/ Pk 

Cfc+l = - OiCk - -Jk-lCk-l) 

^fc+l ^k A' 

end 

Note that there are only two matrix-vector multiplications involved, and storage 
for extra four vectors is needed besides that for the coefficient matrix A. In case 
A is sparse(having an average of m none-zero elements in each row) and large, 
the flop counts for one sweep of the loop is 0{mn). Therefore in exact arithmetic 
operations, there are only 0{vnPn^) flops needed for the whole procedure. 

Remark: there are many options for the initial vectors vi while ui can be 
chosen arbitrarily. For example any row vector Ai of matrix A can be used for 
constructing vi (vi = A'/||Aj|| with ci = hi/\\Ai\\. Another type of options is any 
vector in the form vi = tA'w where w is any none-zero vector and t is a scalar 
such that vi is a unit vector, and in this case one can see that ci can be obtained 
as Cl = tb'w. 


3.3 Analysis of OAP 

In this section we discuss some properties of OAP as a direct method(in exact 
arithmetic operations). Note that any unsymmetric matrix can also be transformed 
by Householder transformation into tridiagonal matrix T{T = V'AV) with V as 
orthogonal matrix, which suggests us to develop a similar algorithm for this type 
of transformation. However it turns out such a Lanczos-like iteration does not exist 
at least for arbitrarily chosen initial unit vector vi. It is thus necessary to verify 
the orthonormality of the vectors sequences {v^ji and in Algorithm]^ 

Theorem 31 Let A he unsymmetric and nonsingular, b G i?"' and x is the solution 
to Ax = b. The vector sequence (k = 1, 2, • • • , n) and (k = 1,2, ■ ■ ■ , n)produced 
in Algorithm^ are orthonormal, assuming no breakdown happens, i.e., Pk tA 0 o.nd 
Ik for any k = 1,2,3,-■■ ,n- 1. 

Proof. Apparently all vectors Vi and Ui {i = 1,2, - ■ ■ ,n) are unit vectors. We 
first show that V 2 V 1 =0 and U 2 U 1 = 0. 

Note that 


V 2 V 1 = 0 77 {a'U\ — Q!lVl)^Vl = 0 77 Ol = v'iA'ui 


(13) 
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the last equation is exactly how ai is constructed in the algorithm, hence we have 
V 2 and vi are orthogonal. Similary we have 

U 2 U 1 = 0 {Avi — aiui)'ui = 0 ai = v'iA'ui (14) 

By induction, we assume ui,U 2 ,--- ,Vk and ui,U 2 ,--- ,Uk are orthonormal se¬ 
quences of vectors, we need to show that = 0 and = 0 for i < A;. 

In fact 

'^k-\-l^k — 0 — U}^ — 0 Ofc — Uj^Av]^ 

and 

= 0 {Avk - akUk - Pk-iUk-iYuk-i = 0 
I3k-i = u^_^Avk 

Pk—l — i^A Ufc — i) U/j 

^fc-1 = v'k{A'uk-i) 

^ ^k-l = v’ki^k-lVk - Olk-lVk-l - 'yk-2Vk-2) 
f^k—1 — Pk — 1 

For i <k — 2 we have 

u'kJ^iUi = 0 {Avk - akUk - l3k-iUk-iyUi = 0 
v'k{A'ui) = 0 

v'kiPiVi+i + aiVi A- 7i_iUi_i = 0 

The last equation holds true since by assumption we have Vk are orthogonal to Vi 
for any i < k. Similarly one can prove u'k^iUi = 0 ior i < k. □ 


3.4 Control of loss of orthogonality 

There are several well-known Krylov subspace methods based on Lanczos iter¬ 
ations. The most famous method might the the wide-spread conjugate gradi¬ 
ent method(CG)(by Hestenes and Stiefel). Other effective methods include MIN- 
RES, SYMMLQ and LSQR(by Paige and Saunders), BiCG(by Fletcher) and 
BiGGstab(by Van der Vorst) and QMR(by Freund and Nachtigal), etc. All of 
these methods (except CG) adopt the strategy of minimizing certain type of resid¬ 
ual norm in related Krylov subspace. 

Unfortunately Lanczos process often suffers severe loss of orthogonality, which 
explains the possible instability of most of the above Krylov subspace methods 
based on Lanczos iteration. It seems that there is no effective way to handle this 
issue in general. Krylov subspace methods based on Arnold! iteration(such as 
GMRES) seems to be more stable but they usually need more storage requirement 
and flops in each iteration and thus usually have to be restarted. 

Krylov subspace methods based on minimizing residual norms usually ignore 
the issue of loss of orthogonality. However it is vital to our orthogonally accumu¬ 
lated projection method. Fortunately we have an easy approach to detect whenever 
loss of orthogonality happens. Our approach is to make sure in every iteration the 
“accumulated” vector ak+i is guaranteed to be orthogonal to current approxima¬ 
tion Xk- Note that Xk is a linear combination of vi,V 2 , • • • ,Vk and ak+i = Ci_|_iUjt+i 
(with Ci_(_i a real number) is supposed to be orthogonal to all Vi for i < k. Thus the 
angle between Xk and Vk+i a is good indicator when loss of orthogonality occurs. 
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And whenever loss of orthogonality happens, we restart the OAP process on the 
residnal equation = Ae^ where = b — Ax^ and = x — x^. This leads to the 
following algorithm. 


Algorithm 4 (Restarted orthogonally accumulated projeetion method-ROAPS) Let A 
in be an unsymmetrie and nonsingular matrix and b G R^ a non-zero vector. Let e be 
a given tolerance. The following procedure produces an approximation to the solution x 
to system 0. 

err = l,r = h\x = 0(g i?"") 
while err > e 

t = I |AV| I, = A'rjt, Cl = b'r/t, ui = vi 
XI = civi,/3o = JO = 0, VO = uo = 0(g i?"') 
for k = 1 to n — 1 
Oik = u'k^Vk 
Pk Oi]^U]^ — 

Ik = \\pk\\ 

Ukjl = Pk/lk 

Qk ^ ^k Oi^Vk Ik—l'Ok — l 

Pk = Ikfcll 
^k-\-l Qk/ Pk 

Ok-\-l Pk OlCk Jk—lO-k—l) 

e = cos~^{xk'vk/\\xk\\) 
if\'x/2-e\ =0 

Xk-\-l Xk T C^_|_i'U^_|_i 

else 

r = b — Axk 
break; 

end 

end 

X = X -\- Xk, r = b — Ax. 
err = \\b — Ax|/||6|| 

end 

Remark: It is easy to see that the above restarted orthogonally accumulated 
projection method is a convergent iterative scheme since the resulted error vector 
sequence produced in every restart iteration is a strictly decreasing sequence in 
terms of their Eucleadean norms. 


4 An orthogonally accumulated projection through bidiagonalization 

In this section we propose an iterative scheme similar to the OAP algorithm intro¬ 
duced in section Instead of using Lanczos-like process based on tridiagonalization 
of an unsymmetrie matrix, we show in this section that an analogous Lanczos-like 
process can also be based on bidiagonalization of unsymmetrie matrix. 
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4.1 Matrix bidiagonalization 

Any matrix A G transformed into the following bidiagonal form 


U'AV = T 


(15) 


where T is tridiagonal 


T = 


f ai j3i 0 

0 02 ^2 

0 0 0 

V 0 0 0 


0 

0 


0 \ 
0 


On —1 /3n—1 

0 On / 


both U and V are orthogonal, i.e., U'U = V'V = In- Of course this transform 
can be accomplished stably by applying Householder transformations on both 
sides of A. However a more efficient Lanczos-like process can be used to do the 
bidiagonalization. To illustrate this we rewrite equation (15) into the following 
forms 


AV = UT 


and 


a'u = vt' 

Equating k-th column of both sides of (16) and 0 we have 
Avj^ T Pk—l'^k — l ) ^ 1)2,- 

and 

^'uk = PkVk+i +akVk, k = 1,2,-■ ■ ,n-1 


n 


(16) 

(17) 

(18) 

(19) 


with /3o = 0, where and denote the fc-th columns of matrix V and U sepa¬ 
rately, Vo is a zero vector, i.e, V = [v\,V 2 , ■ ■ ■ ,r’n] and U = [ui,U 2 , ■ ■ ■ ,Un]- Espe¬ 
cially we have 

Av\ = a\u\ and A! u\ = aivi + f3iV2 (20) 

which suggests that if vi is given, then ui and V 2 can be calculated successively. 
The rest vectors Vk{k > 3) and {k > 2) can be calculated by rewriting (18) and 
(fl^ as follows 


and 


Uk — (Aufc /Ifc — i'Ufc—i) 

ak 


'^k-\-l ^ (^ '^k ^k'^k') 


( 21 ) 

( 22 ) 


The following algorithm depicts the above process. 
Algorithm 5 Let A G vi G i?” be a unit vector. 
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/3o = 0, no = 0 G i?"' 
for k = 1 to n — 1 

Pk—l'^k — 1 

= llw^fcll 

uk = wkhk 

Qk = ^'uk - OLkVk 

/3fc = Ikfcll 

t’fc+l — Qk / Pk 
end 

In exact arithmetic operations the above Lanczos-like iteration will produce 
two orthonormal vector sequences v\,V 2 , -'' Vn and ui,U 2 , - ■ 'Un with any starting 
unit vector vi, assuming no break-down happens(i.e., Pk ^ 0 for all k). Note that 
in each loop in the iteration one needs only two matrix-vector multiplications as its 
major flop counts, this makes it very effective when dealing with bidiagonalizations 
of large and sparse matrices. 


4.2 Orthogonally accumulated projection 


To develop a corresponding accumulated projection method, we need a sequence 
of orthonormal vectors {vk}i and the inner-product between each Vk and x, the 
exact solution to the system 0. Again this can be easily obtained if we choose 
a starting unit vector vi with ci = xv\ given, since we have by multiplying both 
sides of equation (22) by a; 


Cfc+I = x'vk+i = ^{x a'U k - akx'vk) = “ afcCfc), (23) 


since Ax = b. This implies that if ci is known, then all the other subsequent 
Ci{i = 2,3, ■■ ■ n) can be calculated by (23). These process can be described in the 


following algorithm, which can be viewed as an augumented Lanzcos iteration for 
solving linear system of equations. 


Algorithm 6 (orthogonally accumulated projection method-OAP2) Let A in be an un- 
symmetric and nonsingular matrix and b G a non-zero vector. Let vi be a unit vector 
and ci(= x'vi) given, where x is the solution to The following process gives the 
exact solution x to system Ax = b. 

xi = civi,Po = 0, no = 0 (g i?"") 

for k = 1 to n — 1 

Pk 2i.Vk Pk — l'^k—l 

Oik = \\Pk\\ 

Uk = Pk/Oik 
Qk ^ Uk OLk^k 
Pk = \\Qk\\ 

Uk-\-l — Qk/Pk 
Ck+i = {b'uk - OLCkP/Pk 
^k T ^k-\-lUk-\-l 

end 
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Note that there are only two matrix-vector multiplications involved, and storage 
for extra three vectors is needed besides that for the coefficient matrix A. Also the 
flop counts for each oap loop is 0(mn) in case A is sparse(having an average of m 
none-zero elements in each row) and large. 

It is also easy to verify the orthonormality of the vector sequences {vi}i and 
{ui}i in Algorithm]^ the conclusion is stated in the following. 

Theorem 41 Let A be unsymmetric and nonsingular, b G i?"' and x is the solution 
to Ax = b. The vector sequences {k = 1, 2, • • • , n) and {k = 1, 2, • • • , n)produced 
in Algorithm\^are orthonormal, assuming no breakdown happens, i.e., /3fc 7 ^ 0 for any 
A: = 1, 2, 3, ■ ■ • , n — 1. 

Proof. Apparently all vectors Vi and Ui (i = 1,2, ■ ■ ■ , n) are unit vectors. We 
first show that =0 and U 2 U\ = 0 . 

Note that 

V2V\ = 0 {A'u\ — a\v\yv\ = 0 

ai = v'lA'ui 
4^ Ctl = CKlUlUl 

the last equation holds true since ui is a unit vector. Similarly we have 

^ 21^1 = 0 {Av2 — /3ini)Vi = 0 
/3i = V2A'ui 
Pi = V2{aivi + P 1 V 2 ) 

The last equation is true since v[v 2 = 0 and V 2 is a unit vector. By induction, we 
assume vi,V 2 , - ■ ■ ,Vk and ui,U 2 , • • • ,Uk are orthonormal sequences of vectors, we 
need to show that = 0 and Uf._^iUi = 0 iov i < k. 

In fact 

u'k+iUk = 0 {Avk+i - PkUkYuk = 0 
Pk= u^,Avk+l 
^ Pk= v'kJ^i{A'Uk)' 

^ Pk= v'kj^^{PkVk + l + OikVk) 

and 

For i < k we have 

u'k+iUi = 0 ^ {Avk+i - PkUk)'ui = 0 
v'kj^^{A'ui) = 0 

(/3j'Ui-|-i A OLiVi) — 0 

The last equation holds true since by assumption we have Vk are orthogonal to Vi 
for any i < k. Similarly one can prove Uk_^_iUi = 0 for i < k. D 

To handle the issue of loss of orthogonality, a restarted orthogonally accumu¬ 
lated projection can be used, which is analogous to Algorithm and is stated 
as 

Algorithm 7 (Restarted orthogonally accumulated projection method-ROAP2) Let A G 
in be an unsymmetric and nonsingular matrix and b G R^ a non-zero vector. 
Let e(<< 1) be a given tolerance. The following procedure produces an approximation 
to the solution x to system 0 . 

err = l,r = b;x = 0 (g R^) 
while err > e 
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t = ||AV||,t;i = A'r(t,ci = b'r/t 
xi = civi,/3o = 0, uo = 0(g i?"") 
for k = 1 to n — 1 

Pk /5fc ——1 

Oik = \\Pk\\ 
ttfc+1 — Pk/0!k 
Qk ^ ^k OLJ^V]^ 

h = Ikfcll 

^^fc+1 = Qk/h 

Cfc+i Uk Oi]^C]f) jjS]^ 

e = cos~^{xk'vk/\\xk\\) 
if |7r/2 - 0| = 0 

^fc+l “1“ 

else 

break; 

end 

end 

X = X + Xk, r = b — Ax. 
err = ||6 — Aa;|/||6|| 


5 Numerical Experiments 


In this section we will examine the nnmerical behavior of the orthogonally ac- 
cnmulated projection methods proposed in previous sections. OAP methods are 
used to solve linear system of equations with unsymmetric as well as symmetric 
coefficient matrices, the results are compared with those obtained by using some 
benchmark Krylov subspace methods packaged in Matlab. In all the experiments 
we use the relative residual norm (||6—d.a;fc||/||6||) as the index for convergence, and 
the convergence tolerance is set as 10”Also the parameter “restart” of GMRES 
is always set as 5 and parameter “maximum iteration number” for GMRES is set 
as the size of each system in all the experiments. 

Example 1. Consider the following convection diffusion problem 


Au + piUx + P 2 Uy + P3U = f{x, y) 

dehned on unit square [0,1]^, which usually describes physical phenomena where 
particles, energy, or other physical quantities are transferred inside a physical sys¬ 
tem due to two processes; diffusion and convection. We use the hve point finite dif¬ 
ference method to discretize the problem, which leads to the following discretized 
equation 


2ui 


(7K? 


-i-Mj.j+i 


'^i + 1,3 Uj-ij 

2h^ 


■P2- 




2h^ 


P3Uij 


= f{xi,Xj) 

on each node point {xi,yj), where Uij = u{xi,yj), hx^ hy denote the step size on 
x-axis and y-axis direction respectively. This leads to a linear system of equation 
Ax = b with A a block tridiagonal unsymmetric matrix. 
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Table shows the comparison of iterative errors among ROAP2 and ROAP3 
and some other prevalent Krylov subspace methods. It seems that OAP methods 
produces better precision than other methods in these experiments, especially than 
that of GMRES. 


Table 1 Example 1: Comparison of relative errors 


n 

ROAP2 

ROAP3 

GMRES 

LSQR 

QMR 

BiCG 

90 

6.0659e-12 

4.8411e-8 

5.8966e-7 

1.1206e-7 

1.4894e-7 

5.9289e-8 

171 

6.1516e-9 

8.8727e-8 

1.0452e-6 

1.2838e-7 

1.5546e-7 

5.8738e-8 

361 

7.3004e-8 

1.8632e-8 

6.5894e-7 

4.1377e-8 

3.9033e-8 

3.0593e-8 

551 

1.2491e-10 

1.5095e-8 

1.0729e-6 

5.3868e-8 

1.1865e-7 

6.4840e-8 

741 

9.0775e-10 

4.3456e-9 

1.1596e-6 

5.8852e-8 

1.1784e-7 

4.2417e-8 

1131 

2.7517e-9 

1.9215e-8 

1.1474e-6 

6.1654e-8 

9.0458e-8 

3.4545e-8 

1521 

1.3374e-8 

2.4574e-8 

1.1846e-6 

2.2966e-8 

5.6823e-8 

1.9139e-8 

2401 

5.0582e-9 

7.3975e-9 

1.2118e-6 

2.4055e-8 

5.0915e-8 

2.0182e-8 


Table 2 Example 1: Comparison of iteration numbers 


n 

ROAP2 

ROAP3 

GMRES 

LSQR 

QMR 

BiCG 

90 

2 

6 

4 

77 

27 

28 

171 

2 

6 

9 

178 

44 

46 

361 

5 

6 

12 

188 

47 

47 

551 

3 

12 

19 

479 

68 

70 

741 

2 

10 

26 

744 

86 

90 

1131 

6 

8 

35 

917 

93 

96 

1521 

9 

8 

43 

764 

94 

96 

2401 

1 

8 

66 

1190 

118 

120 


Example 2. We test the Poisson problem dehned on a L-shaped domain [0,1] x 
[0, U [0, |] X [|, 1]. The resulted coefficient matrices are symmetric and positive 


dehnit. They ususlly have zero pattern shown as in Figure 2(a) and Figure 2(b) 


The comparison of relative errors among OAP and other Krylov subspace methods 
are shown in Table It seems that again OAP methods produce better precision 
than other methods in terms of relative errors. 


Table 3 Example 2: Comparison of relative errors 


n 

ROAP2 

ROAP3 

PCG 

GMRES 

LSQR 

QMR 

BiCG 

SYMMLQ 

MINRES 

200 

1.1714e-7 

9.9202e-9 

7.3111e-8 

7.1231e-7 

7.7353e-8 

1.1013e-7 

7.3111e-8 

7.3111e-8 

1.1013e-7 

500 

1.5743e-7 

1.072e-7 

2.3488e-7 

4.608e-6 

4.2399e-7 

7.5256e-7 

2.3488e-7 

2.3488e-7 

7.5256e-7 

1000 

2.9599e-7 

1.1256e-7 

4.0789e-7 

8.6828e-6 

7.2121e-7 

1.2611e-6 

4.0789e-7 

4.0789e-7 

1.2611e-6 

1400 

3.4842e-7 

2.274e-7 

6.8198e-7 

1.1024e-5 

4.9706e-7 

1.8123e-6 

6.8198e-7 

6.8198e-7 

1.8123e-6 

1700 

2.4713e-7 

5.1705e-7 

5.885e-7 

1.3928e-5 

7.4723e-7 

3.6628e-6 

5.885e-7 

5.885e-7 

3.6628e-6 

2100 

2.5727e-8 

bo 

o 

CO 

1 

5.6357e-7 

1.7001e-5 

2.8269e-7 

1.0733e-6 

5.6357e-7 

5.6357e-7 

1.0733e-6 


Example 3 We take unsymmetric tridiagonal matrix A = diag{—l, 2, —l.ljn 
as coefficient matrix, and the right hand vector b is taken such that the exact 
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(a) distribution of non zero elements of A (b) distribution of non zero elements of A 

Fig. 2 Example 2: Pattern of non-zero elements distribution 
Table 4 Example 2: Comparison of iteration numbers _ 


n 

R,OAP2 

ROAP3 

PCG 

GMRES 

LSQR 

QMR 

BiCG 

SYMMLQ 

MINRES 

200 

6 

6 

41 

8 

170 

41 

41 

40 

41 

500 

6 

12 

69 

14 

425 

68 

69 

68 

68 

1000 

13 

27 

94 

23 

826 

92 

94 

93 

92 

1400 

9 

42 

111 

34 

1149 

108 

111 

110 

108 

1700 

7 

56 

120 

39 

1381 

115 

120 

119 

115 

2100 

6 

62 

111 

44 

1052 

109 

111 

110 

109 


solution is a vector contains the function valnes of x{t) = t{l — t)e^ at grid points 
t = h : h : 1 — h, where h = 1/n. The relative errors and iterative numbers resulted 
from using OAP and other Krylov subspace methods are shown in the Table 
and Table [^respectively. Note that the coefficient matrix has very large condition 
number as n increases, and the condition numbers are listed in the second colnmn 
in Table m 


Table 5 Example 3: Comparison of relative errors 


n 

ROAP2 

ROAP3 

GMRES 

LSQR 

QMR 

BiCG 

600 

3.0413e-4 

3.0413e-4 

1.0063e-3 

3.0414e-4 

9.8330e-4 

1.1523e-3 

900 

1.6567e-4 

1.6567e-4 

5.3850e-4 

1.6569e-4 

5.2552e-4 

6.1508e-4 

1200 

1.0765e-4 

1.0765e-4 

3.4693e-4 

1.0767e-4 

3.3994e-4 

3.9558e-4 

1500 

7.7045e-5 

7.7045e-5 

2.4720e-4 

7.7080e-5 

2.4247e-4 

2.8136e-4 

1800 

5.8620e-5 

5.8620e-5 

1.8768e-4 

5.8666e-5 

1.8396e-4 

2.1319e-4 

2100 

4.6524e-5 

4.6524e-5 

1.4885e-4 

4.6581e-5 

1.4553e-4 

1.6869e-4 


Example 4 We use Matlab routine rand{) to produce coefficient matrix T, the 
right hand side vector b is taken so that the exact solution is a vector contains the 
function values of x{t) = — t)e^^ at grid points t = i * h {i = 1,2, ■ ■ ■ ,n), where 

h = 1/n. The relative errors and iterative numbers resulted from using OAP and 
other Krylov subspace methods are shown in the Table and Table [^respectively. 
We found that except LSQR, other tested methods snch as QMR,BiCG, BiCGstab 
and GMRES all fail to produce convergent resultus in these experiments. 
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Table 6 Example 3: Comparison of iteration numbers 


n 

cond(A) 

ROAP2 

ROAP3 

GMRES 

LSQR 

QMR 

BiCGstab 

600 

3.8846e-tl4 

6 

6 

600 

428 

302 

23 

900 

1.2466e-t21 

6 

6 

900 

388 

370 

23 

1200 

3.6164e-t27 

6 

6 

1200 

357 

354 

23 

1500 

1.8172e-t33 

6 

5 

1800 

316 

327 

23 

1800 

2.6357e-t39 

6 

5 

1800 

316 

327 

23 

2100 

8.0531e-t45 

6 

5 

2100 

296 

311 

23 


Table 7 Example 4: Comparison of relative residual 


n 

ROAP2 

ROAP3 

GMRES 

LSQR 

QMR 

BiCGstab 

300 

9.9465e-7 

7.5874e-7 

2.0905e-2 

9.9106e-7 

7.6749e-3 

2.1436e-2 

600 

7.6515e-7 

5.7610e-7 

1.4925e-2 

9.9968e-7 

1.5316e-2 

1.5317e-2 

900 

5.0974e-7 

8.0796e-7 

1.1373e-2 

9.9755e-7 

1.1457e-2 

1.1468e-2 


Table 8 Example 4: Comparison of iteration numbers 


n 

ROAP2 

ROAP3 

GMRES 

LSQR 

QMR 

BiCG 

300 

106 

229 

300 

536 

1498 

1 

600 

15 

52 

600 

985 

1 

1 

900 

20 

30 

900 

866 

2 

1 


6 Comments and Summary 

The OAP methods introduced in this paper still belong to the category of extended 
Krylov subspace methods since they rely on the construction of Krylov subspaces 
Km{A,v) and Km{A',v) with hxed coefhcient matrix. Although they are also de¬ 
rived from Lanczos process, just like some other Krylov subspace methods such as 
QMR, BiCG, BiCGstab, MINRES, GG; a major feature that makes OAP different 
than the other methods is the detection of loss of orthogonality is used in OAP, 
while the others usually do nothing to deal with loss of orthogonality. This might 
be the explanation of the instability of these classical Krylov subspace methods. 
Also it is easy to show the restart strategy used in OAP leads to a convergent 
iterative scheme, while restarted GMRES does not always guarantee a convergent 
process. As a matter of fact, it can be shown that GG can be viewed as a gener¬ 
alized OAP method where the orthogonality between vectors vi and is defined 
as ViAv 2 = 0 instead of = 0, thus a restart GG method can also be derived 
and is also convergent, while successful adoptionof restart strategy( which leads 
to a convergent iterative scheme) on other classical Krylov subspace methods are 
hard. 

R is also possible for us to develop accelerative schemes similar with those 
presented in[14j [15] for OAP algorithms in case of very large scale computation. 
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